Smart Paper Notes

Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Angel Felipe Magnossão de Paula, Roberto Fray da Silva, Ipek Baris Schlicht

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2021-11-08

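The popularity of social media has created problems such as hate speech and sexism. Identifying and classifying sexism in social media are highly relevant tasks, as they help build a healthier social environment. Nevertheless, these tasks are challenging. This work proposes a system that uses multilingual and monolingual BERT, data point translation, and ensemble strategies for sexism identification and classification in English and Spanish. It was conducted in the context of the sEXism Identification in Social neTworks (EXIST 2021) task, proposed by the Iberian Languages Evaluation Forum (IberLEF). The proposed system and its main components are described, and an in-depth hyperparameter analysis is conducted. The main results observed were: (i) the system obtained better results than the baseline model (multilingual BERT); (ii) ensemble models obtained better results than monolingual models; and (iii) an ensemble model considering all individual models and the best standardized values obtained the best accuracies and F1-scores for both tasks. This work obtained first place in both tasks, with the highest accuracies (0.780 for task 1 and 0.658 for task 2) and F1-scores (F1-binary of 0.780 for task 1, F1-macro for task 2).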
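The abstract above reports an F1-binary score for task 1 and an F1-macro score for task 2. As a minimal sketch of how these two metrics differ, the following standard-library code computes both from label lists. The function names and the toy labels are illustrative assumptions, not taken from the paper or from the EXIST evaluation scripts.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score when `positive` is treated as the positive class (F1-binary)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # By convention here, an undefined F1 (no positives anywhere) counts as 0.
    return 2 * tp / denom if denom else 0.0


def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (F1-macro)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels (hypothetical): 1 = sexist, 0 = not sexist.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
binary_score = f1_for_class(y_true, y_pred, positive=1)
macro_score = f1_macro(y_true, y_pred)
```

F1-binary scores only the positive class, which suits a two-class identification task. F1-macro weights every class equally, which matters when a multi-class categorization task has imbalanced classes.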

translated by Google Translate

Predição de Incidência de Lesão por Pressão em Pacientes de UTI usando Aprendizado de Máquina

Henrique P. Silva, Arthur D. Reys, Daniel S. Severo, Dominique H. Ruther, Flávio A. O. B. Silva, Maria C. S. S. Guimarães, Roberto Z. A. Pinto, Saulo D. S. Pedro, Túlio P. Navarro, Danilo Silva

Categories: Machine Learning

2021-12-23

Pressure ulcers have a high prevalence in ICU patients but are preventable if identified in their initial stages. In practice, the Braden scale is used to classify high-risk patients. This paper investigates the use of machine learning on electronic health record data for this task, using data available in MIMIC-III v1.4. Two main contributions are made: a new approach for evaluating the models and the Braden scale that considers all predictions made during a stay, and a new training method for the machine learning models. The results show superior performance compared with the state of the art; moreover, all models surpass the Braden scale at every operating point of the precision-recall curve.


UniMorph 4.0: Universal Morphology

Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate

Categories: Natural Language Processing

2022-05-07

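The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage, instantiated, normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation, and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have implemented several improvements to the extraction pipeline to tackle issues such as missing gender and macron information. We have also amended the schema to use a hierarchical structure, which is needed for morphological phenomena such as multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. In light of the last UniMorph release, we also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards the inclusion of derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet.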


Sidewalk Measurements from Satellite Images: Preliminary Findings

Maryam Hosseini, Iago B. Araujo, Hamed Yazdanpanah, Eric K. Tokuda, Fabio Miranda, Claudio T. Silva, Roberto M. Cesar Jr

Categories: Computer Vision

2021-12-12

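Large-scale analysis of pedestrian infrastructure, particularly sidewalks, is critical to human-centric urban planning and design. Benefiting from the rich dataset of planimetric features and high-resolution orthoimages provided through the New York City Open Data portal, we train a computer vision model to detect sidewalks, roads, and buildings from remote-sensing imagery, achieving 83% mIoU on a held-out test set. We apply shape analysis techniques to study different attributes of the extracted sidewalks. More specifically, we perform a tile-wise analysis of the width, angle, and curvature of sidewalks, which, aside from their general impact on the walkability and accessibility of urban areas, are known to play a significant role in the mobility of wheelchair users. The preliminary results are promising and give a glimpse of the potential for the proposed approach to be adopted in different cities, enabling researchers and practitioners to obtain a more vivid picture of the pedestrian realm.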
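The 83% figure above is a mean intersection-over-union (mIoU). As a minimal sketch of that metric, the following code computes per-class IoU over flattened pixel labels and averages it. The class ids and the toy label lists are illustrative assumptions; the paper's own evaluation code is not described in the abstract.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes; classes absent from both inputs are skipped."""
    pairs = list(zip(y_true, y_pred))
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union:  # avoid dividing by zero for classes that never occur
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps (hypothetical ids: 0 = sidewalk, 1 = road, 2 = building).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, num_classes=3)
```

Skipping classes that appear in neither the ground truth nor the prediction is one common convention; some toolkits instead count such classes as IoU 0 or 1, which changes the mean.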


Panoptic Segmentation Meets Remote Sensing

Osmar Luiz Ferreira de Carvalho, Osmar Abílio de Carvalho Júnior, Cristiano Rosa e Silva, Anesmar Olino de Albuquerque, Nickolas Castro Santana, Dibio Leandro Borges, Roberto Arnaldo Trancoso Gomes, Renato Fontes Guimarães

Categories: Computer Vision | Artificial Intelligence

2021-11-23

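Panoptic segmentation combines instance and semantic predictions, allowing the simultaneous detection of "things" and "stuff". Effectively approaching panoptic segmentation in remotely sensed data can be auspicious for many challenging problems, since it allows continuous mapping and specific target counting. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labels must encompass both "things" and "stuff" classes, and (c) the annotation format is complex. Thus, aiming to solve these issues and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation; (2) propose annotation conversion software to generate panoptic annotations; (3) propose a novel dataset on urban areas; (4) modify Detectron2 for the task; and (5) evaluate the difficulties of this task in the urban setting. We used an aerial image with a 0.24-meter spatial resolution, considering 14 classes. Our pipeline takes three image inputs, and the proposed software uses point shapefiles to create samples in the COCO format. Our study generated 3,400 samples with 512x512 pixel dimensions. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation in remote sensing and an extensive database for other researchers to use on other data or related problems that require thorough scene understanding.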
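The PQ value reported above is the panoptic quality metric. As a minimal sketch, assuming the standard definition (the sum of IoUs over matched segment pairs divided by TP + FP/2 + FN/2, where a match requires IoU above 0.5), the metric for one class can be computed as follows. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Return (PQ, SQ, RQ) for one class.

    matched_ious: IoU of each predicted/ground-truth segment pair that was
    matched as a true positive (by convention, pairs with IoU > 0.5).
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # segmentation quality
    rq = tp / denom                             # recognition quality
    return sq * rq, sq, rq


# Hypothetical example: two matched segments, one spurious, one missed.
pq, sq, rq = panoptic_quality([0.8, 0.6], num_false_positives=1, num_false_negatives=1)
```

Returning SQ and RQ alongside PQ shows whether a low score comes from poorly delineated segments or from missed and spurious detections. A dataset-level PQ is usually the average of the per-class values.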


Deep Learning for Space Weather Prediction: Bridging the Gap between Heliophysics Data and Theory

John C. Dorelli, Chris Bard, Thomas Y. Chen, Daniel Da Silva, Luiz Fernando Guides dos Santos, Jack Ireland, Michael Kirk, Ryan McGranaghan, Ayris Narock, Teresa Nieves-Chinchilla

Categories: Machine Learning

2022-12-27

Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.


Can a Robot Shoot an Olympic Recurve Bow? A preliminary study

Guilherme Christmann, Lin Yu-Ren, Rodrigo da Silva Guerra, Jacky Baltes

Categories: Robotics

2022-12-21

The field of robotics, and especially humanoid robotics, has several established competitions with research-oriented goals. By challenging robots in a handful of tasks, these competitions provide a way to gauge the state of the art in robotic design, as well as an indicator of how far we are from reaching human performance. The most notable competitions are RoboCup, which has the long-term goal of competing against a real human team in 2050, and the FIRA HuroCup league, in which humanoid robots have to perform tasks based on actual Olympic events. Having robots compete against humans under the same rules is a challenging goal, and we believe that archery is the sport in which humanoid robots have the most potential to achieve it in the near future. In this work, we take a first step in this direction. We present a humanoid robot that is capable of gripping, drawing, and shooting a recurve bow at a target 10 meters away with considerable accuracy. Additionally, we show that it is also capable of shooting over distances of more than 50 meters.


Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection

Vinícius Camargo da Silva, João Paulo Papa, Kelton Augusto Pontara da Costa

Categories: Natural Language Processing | Machine Learning

2022-12-21

Automatic Text Summarization (ATS) is becoming increasingly relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, are usually difficult to interpret. Given the challenge of interpretable learning-based text summarization and the importance it may have for advancing the current state of the ATS field, this work studies the application of two modern Generalized Additive Models with interactions, namely Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem based on linguistic features and binary classification.


Federated Learning Using Three-Operator ADMM

Shashi Kant, José Mairton B. da Silva Jr., Gabor Fodor, Bo Göransson, Mats Bengtsson, Carlo Fischione

Categories: Machine Learning

2022-11-08

Federated learning (FL) has emerged as an instance of the distributed machine learning paradigm that avoids transmitting the data generated on the users' side. Although data are not transmitted, edge devices have to deal with limited communication bandwidth, data heterogeneity, and straggler effects due to the limited computational resources of users' devices. A prominent approach to overcoming such difficulties is FedADMM, which is based on the classical two-operator consensus alternating direction method of multipliers (ADMM). The common assumption of FL algorithms, including FedADMM, is that they learn a global model using data only on the users' side and not on the edge server. However, in edge learning, the server is expected to be near the base station and to have direct access to rich datasets. In this paper, we argue that leveraging the rich data on the edge server is much more beneficial than utilizing only user datasets. Specifically, we show that merely applying FL with an additional virtual user node representing the data on the edge server is inefficient. We propose FedTOP-ADMM, which generalizes FedADMM and is based on a three-operator ADMM-type technique that exploits a smooth cost function on the edge server to learn a global model in parallel with the edge devices. Our numerical experiments indicate that FedTOP-ADMM achieves a substantial gain of up to 33% in communication efficiency in reaching a desired test accuracy, relative to FedADMM with a virtual user on the edge server.
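To make the baseline mentioned above concrete, here is a minimal sketch of classical two-operator consensus ADMM in scaled form, run on a toy problem. It is not FedTOP-ADMM and not the authors' FedADMM implementation. The scalar quadratic local costs f_i(x) = 0.5 * (x - a_i)^2, the function name, and the parameter defaults are all assumptions chosen so that the consensus solution is simply the mean of the a_i.

```python
def consensus_admm(local_targets, rho=1.0, iterations=200):
    """Scaled-form consensus ADMM for min sum_i 0.5 * (x_i - a_i)^2 s.t. x_i = z."""
    n = len(local_targets)
    x = [0.0] * n  # local models, one per user
    u = [0.0] * n  # scaled dual variables, one per user
    z = 0.0        # global (consensus) model held by the server
    for _ in range(iterations):
        # User step: closed-form minimizer of
        # 0.5 * (x - a_i)^2 + (rho / 2) * (x - z + u_i)^2.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_targets, u)]
        # Server step: average the users' messages x_i + u_i.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each user's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z


# Three hypothetical users whose local optima are 1, 2, and 6.
global_model = consensus_admm([1.0, 2.0, 6.0])
```

Only x_i + u_i travels from each user to the server and only z travels back, so raw data never leaves the device. In this toy version the server contributes no cost term of its own, which is the assumption the abstract argues should be relaxed.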


A Machine Learning Approach for DeepFake Detection

Gustavo Cunha Lacerda, Raimundo Claudio da Silva Vasconcelos

Categories: Computer Vision

2022-09-28

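With the spread of DeepFake techniques, this technology has become accessible enough and good enough that there is concern about its malicious use. Faced with this problem, detecting forged faces is of utmost importance for ensuring security and avoiding socio-political problems on both a global and a private scale. This paper presents a solution for detecting DeepFakes using convolutional neural networks and a dataset developed for this purpose, Celeb-DF. The results show that, with an overall accuracy of 95% in classifying these images, the proposed model is close to the state of the art and can be adjusted for manipulation techniques that emerge in the future.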
